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We propose a novel distributed algorithm to cluster graphs. The algorithm recovers the solution obtained from spectral 
clustering without the need for expensive eigenvalue/ vector computations. We prove that, by propagating waves through the 
graph, a local fast Fourier transform yields the local component of every eigenvector of the Laplacian matrix, thus providing 
clustering information. For large graphs, the proposed algorithm is orders of magnitude faster than random walk based 
approaches. We prove the equivalence of the proposed algorithm to spectral clustering and derive convergence rates. We 
demonstrate the benefit of using this decentralized clustering algorithm for community detection in social graphs, accelerating 
distributed estimation in sensor networks and efficient computation of distributed multi-agent search strategies. 
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1 Introduction 

In recent years, there has been great interest in the anal- 
ysis of large interconnected systems, such as sensors net- 
works, social networks, the Internet, biochemical net- 
works, power networks, etc. These systems are charac- 
terized by complex behavior that arises due to interact- 
ing subsystems. For such systems graph theoretic meth- 
ods have recently been applied and extended to study 
these systems. In particular, spectral properties of the 
Laplacian matrix associated with such graphs provide 
useful information for the analysis and design of inter- 
connected systems. The computation of eigenvectors of 
the graph Laplacian is the cornerstone of spectral graph 
theory [1,2], and it is well known that the sign of the sec- 
ond (and successive eigenvectors) can be used to cluster 
graphs [3,4]. 

The problem of graph (or data, in general) clustering 
arises naturally in applications ranging from social an- 
thropology [5] , gene networks [6] , protein sequences [7] , 
sensor networks [8-10], computer graphics [11] and In- 
ternet routing algorithms [12]. 

The basic idea behind graph decomposition is to clus- 
ter nodes into groups with strong intra-connections but 
weak inter-connections. If one poses the clustering prob- 
lem as a minimization of the inter-connection strength 
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(sum of edge weights between clusters), it can be solved 
exactly and quickly [13] . However, the decomposition ob- 
tained is often unbalanced (some clusters are large and 
others small) [2]. To avoid unbalanced cuts, size restric- 
tions are typically placed on the clusters, i.e., instead of 
minimizing inter-connection strength, we minimize the 
ratio of the inter-connection strength to the size of indi- 
vidual clusters. This, however, makes the problem NP- 
complete [14] . Several heuristics to partition graphs have 
been developed over the last few decades [15] including 
the Kernighan-Lin algorithm [16], Potts method [17], 
percolation based methods [18], horizontal- vertical de- 
composition [19] and spectral clustering [3,4]. 

1.1 Spectral clustering 

Spectral clustering has emerged as a powerful tool of 
choice for graph decomposition purposes (see [2] and ref- 
erences therein). The method assigns nodes to clusters 
based on the signs of the elements of the eigenvectors 
of the Laplacian corresponding to increasing eigenval- 
ues [1,3,4]. In [20], the authors have developed a dis- 
tributed algorithm for spectral clustering of graphs. The 
algorithm involves performing random walks, and at ev- 
ery step neglecting probabilities below a threshold value. 
The nodes are then ordered by the ratio of probabilities 
to node degree and grouped into clusters. Since this al- 
gorithm is based on random walks, it suffers, in general, 
from slow convergence rates. 

Since the clustering assignment is computed using the 
eigenvectors/eigenvalues of the Laplacian matrix, one 
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can use standard matrix algorithms for such computa- 
tion [21]. However, as the size of the matrix (and thus 
the corresponding network) increases, the execution of 
these standard algorithms becomes infeasible on mono- 
lithic computing devices. To address this issue, algo- 
rithms for distributed eigenvector computations have 
been proposed [12]. These algorithms, however, are also 
(like the algorithm in [20] ) based on the slow process of 
performing random walks on graphs. 



1.2 Wave equation method 



In a theme similar to Mark Kac's question "Can one 
hear the shape of a drum?" [22], we demonstrate that 
by evolving the wave equation in the graph, nodes can 
"hear" the eigenvectors of the graph Laplacian using 
only local information. Moreover, we demonstrate, both 
theoretically and on examples, that the wave equation 
based algorithm is orders of magnitude faster than ran- 
dom walk based approaches for graphs with large mix- 
ing times. The overall idea of the wave equation based 
approach is to simulate, in a distributed fashion, the 
propagation of a wave through the graph and capture 
the frequencies at which the graph "resonates" . In this 
paper, we show that by using these frequencies one can 
compute the eigenvectors of the Laplacian, thus cluster- 
ing the graph. We also derive conditions that the wave 
must satisfy in order to cluster graphs using the pro- 
posed method. 

The paper is organized as follows: in Section 2 we de- 
scribe current methodologies for distributed eigenvec- 
tor/clustering computation based on the heat equation. 
In Section 3 the new proposed wave equation method 
is presented. In Section 4 we determine bounds on the 
convergence time of the wave equation. In Section 5 we 
show some numerical clustering results for a few graphs, 
including a large social network comprising of thousands 
of nodes and edges. We then show, in Section 6, how 
the wave equation can be used to accelerate distributed 
estimation in a large-scale environment such as a build- 
ing. In Section 7 we show how the proposed distributed 
clustering algorithm enables one to efficiently transform 
a centralized search algorithm into a decentralized one. 
Finally, conclusions are drawn in Section 8. 



2 From heat to wave equation: Related work 



Let Q = (V, E) be a graph with vertex set V = 
{l,...,iV} and edge set E C V x V, where a 
weight Wjj > is associated with each edge G E, 
and W is the N x N weighted adjacency matrix of Q. 
We assume that Wy = if and only if E. The 



(normalized) graph Laplacian is defined as, 

if i = j 

-Wy/E^LiWtf [{(i,j)€E (1) 







otherwise , 



or equivalently, L = I D 1 W where D is the diagonal 
matrix with the row sums of W. 

Note that in this work we only consider undirected 
graphs. The smallest eigenvalue of the Laplacian 
matrix is Ai = 0, with an associated eigenvector 
v' 1 ) = 1 = [1,1,..., 1] T . Eigenvalues of L can be or- 
dered as, — Ai < A 2 < A 3 < • • • < Xn with associated 
eigenvectors 1, v^ 2 \ v^ 3 ' • • • [2]. It is well known 

that the multiplicity of Ai is the number of connected 
components in the graph [23] . We assume in the follow- 
ing that Ai < A2 (the graph does not have disconnected 
clusters). We also assume that there exist unique cuts 
that divide the graph into k clusters. In other words, we 
assume that there exist k distinct eigenvalues close to 
zero [24]. 

Given the Laplacian matrix L, associated with a 
graph Q = (V, E) , spectral clustering divides Q into two 
clusters by computing the sign of the TV elements of 
the second eigenvector v^ 2 \ or Fiedler vector [2,4]. This 
process is depicted in Fig. 1 for a line graph where one 
edge (the edge (5,6)) has lower weight than other edges. 

More than two clusters can be computed from signs of 
the elements of higher eigenvectors, i.e. v^ 3 \ v^ 4 ) , etc. [2]. 
Alternatively, once the graph is divided into two clus- 
ters, the spectral clustering algorithm can be run inde- 
pendently on both clusters to compute further clusters. 
This process is repeated until either a desired number of 
clusters is found or no further clusters can be computed. 
This method can also be used to compute a hierarchy of 
clusters. 

There are many algorithms to compute eigenvectors, 
such as the Lanczos method or orthogonal iteration [21] . 
Although some of these methods are distributable, con- 
vergence is slow [21] and the algorithms do not con- 
sider/take advantage of the fact that the matrix for 
which the eigenvalues and eigenvectors need to be com- 
puted is the adjacency matrix of the underlying graph. 
In [12], the authors propose an algorithm to compute the 
first k largest eigenvectors (associated with the first k 
eigenvalues with greatest absolute value) 1 of a symmet- 
ric matrix. The algorithm in [12] emulates the behavior 
of orthogonal iteration. To compute the first k eigen- 
vectors of a given matrix J, at each node in the net- 
work, matrix V, = ^jeA/"(i) ^ij'Qj * s computed, where 



Note that in the case of spectral clustering we desire to 
compute the smallest k eigenvectors of L. The algorithm is 
still applicable if we consider the matrix I — L. 
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demonstrated as follows. The heat equation is given by, 

du 



at 



Au. 



Cluster A 



where u is a function of time and space, du/dt is the 
partial derivative of u with respect to time, and A is 
the Laplace operator [26]. When the above equation is 
discretized (see [1,27-29] for details) on a graph Q = 
(V, E) one gets the following equation: 



4 5 6 7 

Node Number 



10 



Ui(t+l)=Ui(t) 



, 3 Uj(t). 



for i,j £ V. Here (t) is the scalar value of u on node i at 
time t. The graph Laplacian L = [Ly] appears due to the 
discretization of the A operator [28] . The above iteration 
can be re- written, in matrix form, u(t+ 1) = (I — L) u(t) 
where u(t) = (ui(i), . . . , uat(£)) t . The solution of this 
iteration is, 



u(i) = C 1 + Ci(l- A 2 )*v 



Fig. 1. Spectral clustering: The sign of the i-th element of 
eigenvector V2 determines the cluster assignment of the i-th 
vertex, demonstrated on a simple line graph example(shown 
in the center). With + we plot the value of the components 

Of V2- 

Qj £ R, Nxk is initialized to a random matrix and Af(i) 
is the set of neighbors of node i (including node i it- 
self). Orthonormalization is achieved by the computa- 
tion of matrix Kj = V^V^ at every node, followed by 
computation of matrix K, which is the sum of all the 
matrices in the network. Once matrix K is computed, 
Qi = VjR -1 is updated at each node, where R is a 
unique matrix such that K = R T R (Cholesky decom- 
position) . The above iteration is repeated until Qi con- 
verges to the i-th eigenvector. The sum of all the matri- 
ces K, is done in a decentralized way, using gossip [25] , 
which is a deterministic simulation of a random walk on 
the network. In particular, at each node one computes 
the matrix K as follows, 



S<(i + 1) = B * s i(*)> ( 2 ) 

7T,(t+l) = ]T BjilTjit), (3) 

j'eAT(i) 

for t > r steps, where r is the mixing time for the random 
walk on the graph [12]. Here K = S 4 /tt 4 , S,(0) = K. ( 
and 7Tj(0) = 1 for only one index i and zero for other 
indices. The values are transition probabilities of 
the Markov chain associated with the graph. A natural 

choice is = l/deg(i), where deg(i) is the degree of 3 Wave equation based computation 
node i. Note that matrix B = [B^ ] is the normalized ad- 
jacency matrix (given by D _1 W). This algorithm con- 
verges after Oir log 2 N) iterations [12]. 

The slowest step in the distributed computation of eigen- 
vectors is the simulation of a random walk on the graph 
(defined by Eq. 2 and 3). It is clear from Eq. 1 that 
successive multiplications by the adjacency matrix B in 
Eqs. 2 and 3 are equivalent to successive multiplications 
by matrix I — L. This procedure is equivalent to evolving 
the discretized heat equation on the graph and can be 



(4) 



where constants Cj depend on the initial condition u(0). 
It is interesting to note that in Eq. 4, the dependence of 
the solution on higher eigenvectors and eigenvalues of the 
Laplacian decays with increasing iteration count. Thus, 
it is difficult to devise a fast and distributed method for 
clustering graphs based on the heat equation. Next, we 
derive a novel algorithm based on the idea of permanent 
excitation of the eigenvectors of I — L. We note that the 
above connection between spectral clustering and the 
heat equation is not new and was pointed out in [30,31]. 

Before discussing the details of wave-equation based 
eigenvector computation, we remark that in [32] the 
authors have independently developed a decentralized 
algorithm to compute the eigenvalues of the Laplacian. 
Compared to our approach, their algorithm involves 
solving a fourth order partial differential equation on the 
graph. This imposes twice the cost of communication, 
computation and memory on every node in the graph. 



Consider the wave equation 
d 2 u 



c 2 Au. 



(5) 



Analogous to the heat equation case (Eq. 4) , the solution 
of the wave equation can be expanded in terms of the 
eigenvectors of the Laplacian. However, unlike the heat 
equation where the solution eventually converges to the 
first eigenvector of the Laplacian, in the wave equation 
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all the eigenvectors remain eternally excited [26] (a con- 
sequence of the second derivative of u with respect to 
time). Here we use this observation to develop a simple, 
yet powerful, distributed eigenvector computation algo- 
rithm. The algorithm involves evolving the wave equa- 
tion on the graph and then computing the eigenvectors 
using local FFTs. Note that some properties of the wave 
equation on graphs have been studied in [33]. Here we 
construct a graph decomposition/partitioning algorithm 
based on the discretized wave equation on the graph, 
given by 

Ui(i) = 2u,(t - 1) - Ui(t - 2) - c 2 L «Uj(* - !) - 

(6) 



where X^eA/Xi) ^»j u j'(* — -0 originates from the dis- 
cretization of Au in Eq. 5, see [28] for details. The rest 
of the terms originate from discretization of d 2 u/dt 2 . 
To update Uj using Eq. 6, one needs only the value of u., 
at neighboring nodes and the connecting edge weights 
(along with previous values of u,). 

The main steps of the algorithm are shown as Algo- 
rithm 3.1. Note that at each node (node i in the algo- 
rithm) one only needs nearest neighbor weights and 
the scalar quantities Uj(t — 1) also at nearest neighbors. 
We emphasize, again, that Ui(t) is a scalar quantity and 
Random([0, 1]) is a random initial condition on the inter- 
val [0, 1]. The vector is the i-th component of the 
j-th eigenvector, T max is a positive integer derived in 
Section 4, FrequencyPeak(Y, j ) returns the frequency 
at which the j-th peak occurs and Coefficient^.,) re- 
turn the corresponding Fourier coefficient. 



Proposition 3.1 The wave equation iteration (6) is sta- 
ble on any graph if the wave speed satisfies the following 
inequality, 

0<c<V2, 
with an initial condition ofu(— 1) 



Algorithm 3.1 Wave equation based eigenvector com- 
putation algorithm for node i. At node i one computes 
the sign of the i-th component of the first k eigenvec- 
tors. The cluster assignment is obtained by interpreting 
the vector of k signs as a binary number. 
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Uj(0) «- Random ([0, 1]) 
Ui(-l) <- Uj(0) 
t <- 1 

while t < T max do 

u i (t)<-2u i (t-l)-u i (t-2)- 

c 2 E jeJ V(i) L «»j(*- !) 

t<-t + l 
end while 

K<-FFT([ui(l), ,Ui(T max )}) 

for j e {1, . . . , k} do 

ujj FrequencyPeak (Y, j) 

v t ^ <- Coefficient^) 

if vp } > then 

else 

end if 
end for 

ClusterNumber 



One can write iteration Eq. 7 in matrix form, 




'2I-c 2 L 




z(t) 



This implies that, 



M 



u(t-l) 
u(t-2) / 



z(t-l) 



(8) 



z(t) = M*z(0) , 



(9) 



u(0). 



where z(0) = (u(0), u(— 1)) T . We now analyze the solu- 
tion to Eq. 9 in terms of the eigenvalues and eigenvectors 
of the graph Laplacian L. 

We can compute the eigenvectors of M by solving for a 
generic vector (a.j,hj) T , 



PROOF. For analysis of the algorithm, we consider 
Eq. 6 in vector form, 



This implies that the eigenvectors of M are given by, 



u{t) = -u(t - 2) + (21 - c 2 L)u(t - 1) . 



(7) 



We stress again that, in practice, the algorithm is dis- 
tributed and at every node one updates the state based 
on Eq. 6. The update equations given by Eq. 6 (and 
Eq. 7) correspond to discretization of Eq. 5 with Neu- 
mann boundary conditions [34]. 



m 



with eigenvalues 



(j) 



r(J) 



2-c 2 \i , c 



a 3l,2 = 



(10) 



±-Jc 2 A 2 -4A,. (11) 
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Fig. 2. Plot of functions \0j/2 ± l/2 1 /(6» j ) 2 - 4|. Blue 
(dashed) line is the function with a negative second term. 
Red (solid) line is the function with a positive second term. 

It is evident from Eq. 11 that stability is obtained if and 
only if, 



2 - c 2 A, ± y/{2 - c^Xjf 



< 1 



The absolute value from the above equation is plotted 
for various values of 6 3 ; = 2 — c 2 \ 1 in Fig. 2. The above 
stability condition is satisfied for — 2 < 9j < 2, which 
yields the following bound on c: 



< c < 



The above equation must hold true for all eigenvalues 
of L. The most restrictive of which is c < 2/VAjy- Since 
A at < 2 for all graphs, 

< c< V2, 

guarantees that all the eigenvalues of M have absolute 
value equal to one. However, Eqn. 6 will be unstable if 
any of the eigenvalues of M have geometric multiplicity 
strictly less than the algebraic multiplicity with an initial 
condition that has non-zero projection on the unstable 
generalized eigenvectors. We now derive conditions so 
that these instabilities do not arise. 

From Eqn. 11 it is evident that there are three cases to 
analyze. 

Case i): Since L always has an eigenvalue at 0, this im- 
plies that M always has an eigenvalue at 1 with alge- 
braic multiplicity two. It can be shown that the geo- 
metric multiplicity of this eigenvalue is equal to one. 
The corresponding eigenvector is l2ivxi, with a gen- 
eralized eigenvector (1, — 1) T . To avoid instability, the 
initial conditions must be of the form (u(0), u(0)) T . 
In other words, we set u( — 1) = u(0) to ensure that 
the initial condition is orthogonal to (1, — 1) T . 



Case it): If L has k repeated eigenvalues, it implies 
that M has k repeated eigenvalues. In this case, how- 
ever, the geometric and algebraic multiplicities are 
equal. One can show that the matrix L is similar to 
the symmetric matrix, 



D 1/2 (D W)D 



1/2 



(in particular, L = D~ 1 / 2 L S y m D 1 / 2 ), implying that 
L is diagonalizable. Thus, the eigenvectors of L asso- 
ciated with the repeated eigenvalues are linearly inde- 
pendent. Since matrix M has eigenvectors of the form 
shown in Eqn. 10, the repeated eigenvalues of M have 
eigenvectors that are linearly independent. 
Case Hi): The matrix M has a repeated eigenvalue at - 
1 if c 2 = 2 and A at — 2. This repeated eigenvalue 
has an associated eigenvector (— vjv, ^n) T and a gen- 
eralized eigenvector {vn,^n, ) • Clearly, in this case 
the initial condition would need to be orthogonal to 
both the vector (1,— 1) T and the vector (v/v,v/v) T . 
This can be achieved if and only if u(0)_I_vat and 
u( — 1) = u(0). This is an undesirable condition, as it 
requires prior knowledge of vjv. We avoid this situa- 
tion by setting c < 

Thus, we can guarantee stability of the wave equation 
iteration on any graph (given by Eqn. 6), as long as < 
c < \/2 and the initial condition has the form u(— 1) = 
u(0). 

Notice that the condition u(— 1) ^ u(0) is analogous 
to a non-zero initial derivative condition on u for the 
continuous PDE, which is known to give a solution that 
grows in time [26] . 

Remark 3.2 Although we call c the wave speed, it only 
controls the extent to which neighbors influence each 
other and not the speed of information propagation in 
the graph. 

Proposition 3.3 The clusters of graph Q , deter- 
mined by the signs of the elements of the eigenvectors 
of L, can be computed using the frequencies and coef- 
ficients obtained from the Fast Fourier Transform of 
(iij(l), . . . , Uj(T maa: )), for all i and some > 0. 

Here \ii is governed by the wave equation on the graph 
with the initial condition u(— 1) = u(0) andO < c < \f2. 



PROOF. We can write the eigenvectors of M as, 
m (J) = p(j) ± jqCi) ( 

where, 



r U) 



, q 



V 



5 



Using ctj = e lUj , we can represent the solution of the 
update equation (Eqn. 6), or equivalently, 



z(t) = M*z(0) , 



(12) 



corresponding to peaks, algorithms like multiple signal 
classification [35] can overcome these difficulties. The 
investigation of such algorithms, as well as windowing 
methods, is the subject of future work. 



by expanding z(0) in terms of and qW). Recall, 
that z(0) = (u(0),u(0)) T is orthogonal to the general- 
ized eigenvector (1,— 1) T . Thus, z(0) is represented as 
a linear combination of (1, 1) T and m^) for j > 2. This 
implies that the solution to Eqn. 8 and 9 is given by, 

N 

z(t) = C h \p U) cos(tojj) - q (3) sin(fa^) 

+ C j2 [p (j) sin(t Wj ) + q (j) cos^Wj) , (13) 



where 



C h =*(0) T pV\ Q 2 =z(0) T q' 



(3) 



(14) 



It is easy to see that at every node, say the i-th node, 
one can locally perform an FFT on (iij(l), . . . , Ui(T max )) 
(where each value is computed using the update law 
in Eq. 6) to obtain the eigenvectors. At the i-th node 
of the graph, one computes the i-th component of ev- 
ery eigenvector from the coefficients of the FFT. More 
precisely, for node i, the coefficient of cos(tWj) is given 



4 Performance analysis 

An important quantity related to the wave equation 
based algorithm is the time needed to compute the eigen- 
values and eigenvectors components. The distributed 
eigenvector algorithm proposed in [12] converges at a 
rate of 0(r log 2 (N)) , where r is the mixing time of the 
Markov chain associated with the random walk on the 
graph. We derive a similar convergence bound for the 
wave equation based algorithm. 

It is evident from Eq. 13 that one needs to resolve the 
lowest frequency to cluster the graph. Let us assume that 
one needs to wait for 77 cycles of the lowest frequency to 
resolve it successfully (i.e. the number of cycles needed 
for a peak to appear in the FFT) 2 . The time needed to 
cluster the graph based on the wave equation is, 



T =± 

-L max 

L02 



(15) 



From Eq. 11 it is easy to sec that cos((x> 2 ) = Real(a2) 
(Cj 1 + Cj 2 )v\ J> . The sign of the coefficients of the eigen- (2 — c 2 A 2 )/2. Note that in [36] it was shown that t 



vector(s) provide the cluster assignment (s). 



(log |1 - A2I)" 1 . Thus, it follows that, 



Remark 3.4 The above algorithm assumes that one ex- 
cites every frequency (or depending on the number of 
clusters, at least the first k frequencies). This is achieved 
if z(0) is not orthogonal to and q^ (Cj 1 and Cj 2 
must be non-zero). As mentioned before, an initial con- 
dition of the form z(0) = (u(0),u(0)) T prevents linear 
growth of the solution, however, u(0) should also not be 
orthogonal to v^ 2 -* , v' 3 ^ . . . v' fe ' . This is easy to guarantee 
(with probability one) by picking a random initial condi- 
tion at each node. 

Remark 3.5 Note that the wave equation can also be 
used as a distributed algorithm for eigenvector and eigen- 
value computation of L. From the FFT we can com- 
pute ujj which in turn allows us to compute the eigenval- 
ues Xj . The eigenvector components are computed using 
the coefficients ofcos(tbjj) (or equivalently sm(tujj)). 

Remark 3.6 The algorithm is also attractive from a 
communication point of view. In [12] entire matrices need 
to be passed from one node to another. In our algorithm 
only scalar quantities Uj need to be communicated. 

Remark 3.7 Peak detection algorithms based on the 
FFT are typically not very robust because of spectral 
leakage. As we are only interested in the frequencies 



(2 + c 2 {e- 1 ' T - 1) \ 
cj 2 = arccos I I . 

Hence, the convergence of the wave equation based 
eigenvector computation depends on the mixing time of 
the underlying Markov chain on the graph, and is given 

by, 

/ /2 + c 2 ( e - 1 /--l)\" 1 \ 
Tmax = O I arccos I K — -J . (16) 

In the wave equation based clustering computation, one 
can at the i-th node, compute the i-th component of ev- 
ery eigenvector (along with all the eigenvalues) of the 
graph Laplacian, thus assigning every node to a cluster. 
If one uses the wave equation to compute eigenvectors, 
to ensure that at every node one has entire eigenvec- 
tors, an extra communication step needs to be added. 
As a final step, locally computed eigenvectors compo- 
nents are transmitted to all other nodes. The cost of this 



2 The constant n is related to the FFT algorithm and in- 
dependent of the graph. Typically 6-7 cycles of the lowest 
frequency are necessary to discriminate it. 
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Fig. 3. Comparison of convergence rates between the dis- 
tributed algorithm in [12] and our proposed wave equation 
algorithm for c 2 = 1.99. The wave equation based algorithm 
has better scaling with r for graphs of any size (given by N). 
The plots are upper bounds on the convergence speed. 

step is O(N) (worst case). Thus, convergence of the dis- 
tributed eigenvectors computation scales as, 

, /2 + c 2 (e- 1 / T -l)\ , 
= O ( arccos [ y — '- ) ) + O(N) . 

(17) 

Note that simple analysis shows that for large r our 
algorithm has a convergence rate of y/r/c (as O(N) gets 
dominated by r). It is interesting to note that in the 
discretized wave equation, though the constant c loses 
the meaning of wave speed (that it has in the continuous 
case) it does impact the speed of convergence. 

The convergence of wave equation based clustering is 
compared to convergence of distributed spectral clus- 
tering in Fig. 3, for c 2 = 1.99. In particular, the figure 
shows that wave equation based clustering has, in gen- 
eral, better scaling, with respect to t, than [12]. 

Note that the proximity of uj-j to uj^ (or the proximity 
of A3 to A2) will influence the constant in Eq. 16. The 
resolution of the FFT is 0(1/ K ) , where K is the number 
of samples. Thus, K has to exceed 1/|cj 3 — o; 2 |, to enable 
computation of two separate peaks. The closer A3 is to 
A2, the greater are the number of samples that each node 
needs to store in order to obtain a good estimate of uj 2 
using the FFT. A similar constant depending on the ratio 
of A2 and A3 arises in distributed spectral clustering [12] 
and any power iteration based scheme for eigenvector 
computation [21]. 

Practically, if the lowest frequency of the FFT does not 
change for a pre-defined length of time, we assume that 
convergence has been achieved. 

From Eq. 16 it seems that the proposed clustering algo- 
rithm is independent of the size of the graph (since \fr I c 
dominates 0(N)). This, however, is not true. Larger 




Fig. 4. The ring graph Cjv with N nodes. Every edge has a 
weight of 1. 

graphs with low connectivity tend to have higher mix- 
ing times. Take for example, a cyclic graph Cm shown in 
Fig. 4. We use the cyclic graph as a benchmark as one 
can explicitly compute the mixing time as a function 
of N and make a comparison with [12]. Of course, no 
unique spectral cut exists for such a graph. The second 
eigenvalue of the Laplacian for Cm is given by, 



. 2tt 

A2 = 1 — cos — 
N 



(18) 



Thus, the mixing time of the Markov chain is given by, 

2 



1 



In (cos (2tt/JV)) 



(19) 



From Eq. 16, one can show that the time for convergence 
of the wave equation is, 



T — 

-l m.n.x — 



n 



N 

arccos(l + 0.5c 2 (cos(27r/7V) - 1)) ~ 



(20) 



As expected, Eq. 20 predicts that as the graph becomes 
larger, the convergence time for the wave equation based 
algorithm increases. We numerically compute and com- 
pare the convergence times for random walks and wave 
equation on the cyclic graph (by explicitly running the 
iterations for both processes and checking for conver- 
gence). The results are shown in Fig. 5. 

5 Numerical results 

Since our algorithm should predict the same partitions 
as spectral clustering, we demonstrate the algorithm on 
illustrative examples. Our first example, is the simple 
line graph shown in Fig. 6. Nodes 1 to 100 and 101 to 
200 are connected to their nearest neighbors with edge 
weight 1. The edge between nodes 100 and 101 has weight 
0.1. As expected, spectral clustering predicts a cut be- 
tween nodes 100 and 101. We propagate the wave on the 
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Fig. 5. Convergence of random walk and wave equation on 
the cyclic graph Cn as a function of number of nodes, N. 
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■101 199 — 200 
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Fig. 6. A line graph with nearest neighbor coupling. The edge 
between 100 and 101 is a weak connection with weight 0.1, all 
other edges have weight 1.0. Vertical line shows the predicted 
cut. 
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Fig. 7. FFT of [Ui(l) . . . Uj(T)] for any node i of the line 
graph. Red circle marks the lowest frequency. 

graph using update Eq. 6 at every node. At each node, 
one then performs an FFT on the local history of u. The 
FFT frequencies are the same for all nodes (evident from 
Eq. 13) and shown in Fig. 7. The sign of the coefficients 
of the lowest frequency in the FFT are shown in Fig. 8. 
It is evident from this figure that the sign of the coef- 
ficients change sign exactly at the location of the weak 
connection, predicting a cut between nodes 100 and 101 
(consistent with spectral clustering). 



Fig. 8. Signs of the coefficients of the lowest frequency for 
the line graph. 

assume that all the edges have weight 1 . 

W. Zachary, a sociologist, was studying friendships at a 
Karate club when it split into two. As expected, mem- 
bers picked the club with more friends. This example 
serves as an ideal test bed for clustering algorithms. Any 
effective clustering algorithm is expected to predict the 
observed schism. Community detection and graph clus- 
tering algorithms are routinely tested on this example, 
see [15,39-41] for a few such demonstrations. 

We first apply spectral clustering on this example, then 
run our wave equation based clustering algorithm, and 
compare the results in Fig. 9. As expected, we find that 
both algorithms partition the graph into exactly the 
same clusters. 

We also demonstrate our algorithm on a large Fortu- 
nato benchmark with 1000 nodes and 99084 edges. The 
graph has two natural clusters with 680 and 320 nodes 
respectively. These clusters are shown in Fig. 10. The 
wave equation based clustering computes the graph cut 
exactly. 

Thus, wave equation based eigenvector computation can 
be used to partition both abstract graphs on parallel 
computers, or physical networks such as swarms of un- 
manned vehicles, sensor networks, embedded networks 
or the Internet. This clustering can aid communication, 
routing, estimation and task allocation. 

We now show how clustering can be effectively used to 
accelerate distributed estimation and search algorithms. 



6 Distributed estimation over clusters 



We now demonstrate our distributed wave equation 
based clustering algorithm on the Zachary Karate club 
graph [37] and on a Fortunato benchmark example [38]. 
These social networks are defined by the adjacency 
matrix that is determined by social interactions. We 



Distributed estimation has recently received signifi- 
cant attention see [42]-[45] and references therein. Dis- 
tributed estimation algorithms require the entire net- 
work of sensors to exchange (through nearest neighbor 
communication) data about the measured variables in 
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19 21 



Fig. 9. Graph decompositions predicted by spectral and wave 
equation based clustering algorithms. Both methods predict 
the same graph cut. 




Fig. 10. A Fortunato community detection benchmark with 
1000 nodes and 99084 edges. Wave equation based clustering 
computes the graph cut exactly. 

order to obtain an overall estimate, which is asymptoti- 
cally (in the number of iterations) optimal. This results 
in estimators with error dynamics that converge to zero 
very slowly. It is well known that these type of algo- 
rithm can be accelerated using multi-scale approaches, 
see for example [46-48]. The key idea in these multi- 
scale approaches is to partition the sensor network into 
clusters, solve the distributed problem in each cluster 
and fuse the information between clusters. 

As the overall estimation process is distributed, it is de- 
sirable that the multi-scale speedup is achieved through 
a distributed process as well. This means that the clus- 
tering must be computed, in a bottom-up fashion, from 
the structure of the network. We show in the following 
a simple yet illustrative example, where the wave equa- 
tion based clustering algorithm can be used to accelerate 
distributed estimation computation by exploiting prop- 
erties of the overall sensor network. 
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Fig. 11. A two floor building subdivided into 64 cells/rooms 
for each floor. In each room there is a sensor node capable of 
communicating with neighbors within a radius of 10m. The 
thick black line, depicts walls that degrade communication 
strength. 

We consider the contaminant transport problem in 
a building [46] with two floors, each divided into 64 
cells/rooms (see Fig. 11). A sensor, to detect the con- 
taminant, is present in each cell. Sensors can communi- 
cate if their relative distance is less than 10m. However, 
we assume that only four sensors can communicate 
between floors, namely those placed within common 
staircases connecting the two floors. On the first floor, 
sensors can communicate across the empty space in be- 
tween (we assume that windows are present), whereas 
on the second floor we assume that there are walls that 
reduce the communication range. We further assume 
that walls marked with a thick black line, see Fig. 11, 
degrade communication between the nodes that are 
inside the area to those outside. As in [46], we assume 
that the contaminant is produced in four rooms, two 
on the first floor and two on the second. Under the 
simplifying assumption of perfect mixing within each 
cell/room volume, the contaminant propagates within 
the building (see [46] for details) according to: 

^ >i ^ i ~dt ~ FjiCj — Fijd + Gi — RiCi , 

p : Density C : Contaminant concentration 
V : Volume Fji : Mass flow rate from node j to i 
G : Contaminant generation rate 
R : Mass removal rate . 

A constant inward flow of air is introduced at a corner 
of the second floor, and outflow openings exist wher- 
ever windows are open to the outside. We consider a 
distributed Kalman filter as one that uses consensus to 
average the estimates and covariance matrices between 
Kalman filter updates, see [46] for details. 

The idea of using the wave equation for distributed 
clustering is to "discover" in a bottom-up fashion, the 
presence of clusters and exploit strong inter-cluster 
connectivity to accelerate computation. In particular, 
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we demonstrate the benefit of using the "bottom-up" 
approach. In the building example there are two main 
clusters (first and second floors), which a filter and net- 
work designer can a-priori assume to know. The four 
clusters on the second floor, however, would not be 
known to the designer unless extensive communication 
measurements are carried out. 

In order to determine the four clusters based on SNR, 
on the second floor, the wave equation based clustering 
was run for 600 steps. The clustering clearly needs to 
be run only once, unless there is very strong variation 
of SNR or the network. In this particular example we 
assume that the SNR and the network do not vary. 

6. 1 Numerical results 

Numerical results are obtained by running the Kalman 
filter interleaved with the consensus step, see [46]. We 
fix 10 iterations for the consensus step in each cluster. 
Fig. 12 shows the estimation result for 100 updates of the 
Kalman filter 3 for both clustering strategies described 
previously. In particular, Figs. 12a and 12b show the 
estimate (solid line) of the concentration (the true value 
is shown with dash-dot line) in room 49 made by all 
the sensors in the building. It can be clearly seen that 
the estimate in Fig. 12a is not as accurate as the one in 
Fig. 12b. The reason is that the consensus step for the 
case of four clusters on the second floor converges much 
faster to the true covariance compared to the case of two 
clusters. In comparison, if consensus is run for the case of 
two clusters, it requires more than 500 iterations in each 
consensus step to converge to the accuracy of Fig. 12b. 

In the 5 cluster case, all the nodes in the building have 
accurate estimates of the contaminant concentration for 
rooms located on the first floor. This is because sensors 
on the first floor are strongly connected to one another 
and 10 iterations are enough to converge to the true 
covariance (with only slight corruption by the "uncon- 
verged" averaging on the second floor) . 

These simulations show that the wave equation based 
clustering provides an efficient distributed bottom-up 
methodology for partitioning sensor networks and accel- 
erating distributed estimation algorithms. 

7 Mobile Sensor Networks 

We demonstrate the utility of distributed partitioning 
for computing the trajectories of mobile sensors/ vehicles 



3 We assume that the consensus step is fast compared to 
the contaminant spreading so that no compensation of delay 
is required at the nodes while running the Kalman filter. It 
is clear that for estimation, shortening the consensus step is 
crucial in order to have a consistent estimate. 



for the purpose of efficiently searching a large area. 
In [49] the authors have developed an algorithm to op- 
timally search a region, given a prior distribution that 
models the likelihood of finding the target in any given 
location (see for example Fig. 13). The trajectories are 
computed using a set of ordinary differential equations 
given by, 

x j (t)=u j (t). (21) 

The above equation describes the dynamics of the j-th 
vehicle, where xj(t) and Uj(t) are the position and the 
control input of the j-th vehicle at time t respectively. 
The authors prove that the control law 



Uj(t) = —Un 



(22) 



\BM\ ' 

efficiently samples the prior distribution for search. Here, 

AfcSfc(t)V/ fc (a: j (t)) 



(fk, fk) 



(23) 



where fk are the Fourier basis functions that satisfy 
the Neumann boundary conditions on the domain to be 
searched and k is the corresponding basis vector num- 
ber. The quantities Sk(t) are governed by the following 
differential equation, 



ds k (t) EjLiAteW) 



dt (fk,fk) 
where N is the number of vehicles. 



(24) 



In [49] the trajectories arc computed a-priori for a given 
distribution (belief map), using Eqns 21, 22, 23, 24. Here 
we perform online computations for trajectories gener- 
ation in a distributed setting . The sum X^jLi fk{ x j{t)) 
over all vehicles in Eq. 24 is the centralized quantity that 
needs to be computed in a distributed manner. At ev- 
ery time instant (every time step of the Runge Kutta 
scheme), the vehicles are partitioned into groups using 
the wave equation based clustering algorithm and the 
sum in Eq. 24 is computed over the clusters and the so- 
lutions added. All the vehicles then compute a piece of 
their trajectory for a predetermined horizon of time (for 
a single Runge Kutta time step). These pieces of trajec- 
tories for each agent are merged together to give Fig. 14. 
In this way, the mobile sensors group themselves into 
clusters and compute their trajectories in a distributed 
manner. 



8 Conclusions 

In this work, we have constructed a wave equation based 
algorithm for computing the clusters in a graph. The al- 
gorithm is completely distributed and at every node one 
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Fig. 12. Concentration estimates for room 49. Concentration estimates (solid line) are compared to the true value (dash-dot 
line). The dashed lines give the ±3cr curves around the estimate. As it is clear from the plots, the strategy in which the 
consensus step is run using five clusters (b) is much better than using two clusters (a). 




Fig. 13. Prior belief map (distribution) for targets. 

can compute cluster assignments based solely on local in- 
formation. In addition, this algorithm is orders of magni- 
tude faster than state-of-the-art distributed eigenvector 
computation algorithms. Starting from a random initial 
condition at every node, one evolves the wave equation 
and updates the state based solely on the scalar states of 
neighbors. One then performs an FFT at each node and 
computes the sign of the components of the eigenvectors 
of the graph Laplacian. Complete eigenvector informa- 
tion can be transmitted to each node using multi-hop 
communication. This process is equivalent to spectral 
clustering. 

The algorithm is also attractive from a distributed com- 
puting point of view, where parallel simulations of large 
dynamical systems [50] can be coupled to the distributed 
clustering approach presented here, to provide scalable 
solutions for problems that are computationally and the- 



Fig. 14. Trajectories generated using distributing spectral 
search algorithm that uses wave equation based clustering. 

oretically intractable. This application is the subject of 
current research. 

Wave equation based clustering is demonstrated on com- 
munity detection examples. Applications to multi-scale 
distributed estimation and distributed search are also 
demonstrated. 

Current work includes the extension of the wave equa- 
tion based algorithm for dynamic networks. This is 
clearly very important in situations where the weights 
on the edges of the graph are time varying. Examples 
of systems where dynamic graphs arise include UAV 
swarms, nonlinear dynamical systems and evolving 
social graphs. 
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